Skip to content

fix(decisioning): project malformed-JSON 2xx responses to AdcpError#836

Merged
bokelley merged 1 commit into
mainfrom
claude/issue-453-json-decode-error-recut
May 26, 2026
Merged

fix(decisioning): project malformed-JSON 2xx responses to AdcpError#836
bokelley merged 1 commit into
mainfrom
claude/issue-453-json-decode-error-recut

Conversation

@bokelley
Copy link
Copy Markdown
Contributor

Refs #453

When an upstream returns a successful (2xx) status with a non-JSON body (e.g. a CDN or proxy returning an HTML error page), UpstreamHttpClient._request previously propagated a raw json.JSONDecodeError from response.json(). Adopters catch AdcpError throughout; an unhandled ValueError subclass breaks that contract and is invisible in typed code. This PR wraps the call in except ValueError and re-raises as AdcpError(SERVICE_UNAVAILABLE, recovery="transient") — the same code used for 5xx responses, since an upstream returning HTML for a JSON endpoint is indistinguishable from a transient infrastructure failure.

This is a re-cut of the original branch from #767 (claude/issue-453-json-decode-error-handling), which was behind main and had maintainer edits disabled. The diff is identical to #767 but rebased cleanly on current main.

What changed

  • src/adcp/decisioning/upstream.py: wrap response.json() in try/except ValueError; raise AdcpError(SERVICE_UNAVAILABLE) with recovery="transient" and from exc chain.
  • tests/test_upstream_helpers.py: add test_200_with_malformed_json_raises_service_unavailable — mocks a 200 with an HTML body and asserts code, recovery, and __cause__.

What tested

  • pytest tests/test_upstream_helpers.py -q35 passed, 0 failed
  • mypy src/adcp/decisioning/upstream.py — clean
  • ruff check src/adcp/decisioning/upstream.py tests/test_upstream_helpers.py — clean

Pre-PR review

  • code-reviewer: approved — flagged missing @pytest.mark.asyncio (false positive: asyncio_mode = "auto" in pyproject.toml); one nit on truncating exc in message (low severity, AdcpError is internal)
  • dx-expert: approved — SERVICE_UNAVAILABLE / transient is correct for CDN/proxy interception; nit on adding bounded response.text[:200] snippet to message; no blockers

Nits (not fixed):

  • Error message embeds raw exc text rather than a body-length hint; useful for debugging, acceptable for an internal error type
  • recovery="transient" is correct per the spec but can't distinguish "CDN blip" from "permanently misconfigured upstream" — no better enum value exists

Triage-managed PR. This bot does not currently iterate on
review comments or PR conversation threads (only on the source
issue). To unblock:

  • Push fixup commits directly: gh pr checkout <num>
    fix → push.
  • Or re-trigger: comment /triage execute on the source
    issue.

See adcp#3121
for context.

Session: https://claude.ai/code/session_018wq8sxhkG9c2WW4vgpYgS8


Generated by Claude Code

When an upstream returns a 2xx status with a non-JSON body (e.g. a CDN
or proxy returning an HTML error page), re-raise json.JSONDecodeError as
AdcpError(SERVICE_UNAVAILABLE, recovery="transient") so adopters get a
typed error consistent with the rest of the error-projection surface.

Refs #453

https://claude.ai/code/session_018wq8sxhkG9c2WW4vgpYgS8
Copy link
Copy Markdown

@aao-ipr-bot aao-ipr-bot Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Right shape — mirrors the existing 5xx→SERVICE_UNAVAILABLE/transient projection at upstream.py:139-144 exactly, so the typed-error contract holds end-to-end. Adopters catching AdcpError no longer leak a ValueError subclass on the happy-path return.

Things I checked

  • AdcpError("SERVICE_UNAVAILABLE", message=..., recovery="transient") call shape matches the constructor at types.py:109-128 and the prior art at upstream.py:139-144. from exc chains the JSONDecodeError into __cause__ — test asserts it at tests/test_upstream_helpers.py:443.
  • except ValueError is the right catch: httpx.Response.json() documents ValueError; json.JSONDecodeError is a subclass. Tighter than catching Exception, looser than catching the subclass directly — correct trade-off.
  • Only one real response.json() call site in src/adcp/decisioning/ (upstream.py:286). The match in property_list.py:57 is docstring example code. No missed sites — no consistency leak.
  • 3.0 error-code enum has no UPSTREAM_ERROR / INTERNAL_ERROR / GATEWAY_* member. SERVICE_UNAVAILABLE is the only spec-conformant projection for a 2xx-with-non-JSON-body in 3.0. INVALID_REQUEST would falsely blame the buyer.
  • Message embeds path, method, status, and str(exc). JSONDecodeError.__str__ reveals position/line only — no body bytes, no creds. Not a leak.
  • CI: 5 Python versions, Postgres conformance, four storyboard runners — all green.

Follow-ups (non-blocking — file as issues)

  • 3.1 CONFIGURATION_ERROR/terminal projection. schemas/cache/3.1.0-beta.3/enums/error-code.json adds CONFIGURATION_ERROR (recovery: terminal) explicitly for "seller's deployment is misconfigured in a way that prevents handling the request — the buyer cannot fix it, retrying will not help." A CDN permanently serving HTML for a JSON endpoint is exactly that — transient will have buyers retry-with-backoff into a brick wall. When the dispatch pins to 3.1+, branch this projection to CONFIGURATION_ERROR. Worth a TODO; not a blocker for 3.0.
  • Message hygiene. _project_status at upstream.py:112-114 already has the right pattern: body_text[:200] snippet for the wire message, leaving parser internals out. The new branch instead concatenates str(exc) ("Expecting value: line 1 column 1 (char 0)"). 3.1's CONFIGURATION_ERROR description explicitly forbids stack traces / parser internals in error.message. Switch to a bounded response.text[:200] snippet — more useful for adopters debugging the CDN page, and forward-compat with the 3.1 message-hygiene rule. __cause__ keeps the decoder error for typed catches.

Minor nits (non-blocking)

  1. Test cleanup on assertion failure. tests/test_upstream_helpers.py:434-443 calls await client.aclose() after the asserts, so a failing assert leaks the client. The other tests in the file use the same pattern, so this is internally consistent — flag it the next time the file gets a sweep, not now.

Approving on the strength of the exact-mirror of the existing 5xx projection plus a test that pins the __cause__ invariant. A 30-line fix to a one-line bug — the kind of PR that should sail through.

@bokelley bokelley merged commit c547dfe into main May 26, 2026
24 checks passed
@bokelley bokelley deleted the claude/issue-453-json-decode-error-recut branch May 26, 2026 00:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants